Abstract


Continuous exponential families may be employed to find continuous distributions with the 
same initial moments as the discrete distributions encountered in typical applications of classical 
equating. These continuous distributions provide distribution functions and quantile functions 
that may be employed in equating. To illustrate, an application is considered for a randomly 
equivalent groups design.

1 Introduction


In common equipercentile equating methods such as the percentile rank method or kernel
equating (von Davier, Holland, & Thayer, 2004), discrete distributions of test scores are 
approximated by continuous distributions with positive density functions on intervals that 
include all possible scores. The approximations used are not entirely satisfactory in terms of the 
relationships of the moments of the approximating distributions and the moments of the original 
distributions. In addition, use of percentile rank typically results in conversion functions that are 
not differentiable at all points, while typical use of the kernel method requires both estimation 
of probabilities by use of log-linear models and smoothing of the resulting distribution function 
by use of a kernel. One method to reduce this difficulty involves use of continuous exponential 
families. With continuous exponential families, a one-step construction of a distribution function is 
provided by a method comparable computationally to use of a log-linear model, and moments are 
fit exactly where desired. This report describes use of continuous exponential families in equating, 
develops appropriate methods for estimation and model evaluation, and compares results to those 
from more conventional approaches to equipercentile equating. 

For simplicity, an equivalent groups design is considered in which Test Forms 1 and 2 are compared. Raw scores on Form 1 are integers from $c_1$ to $d_1$, and raw scores on Form 2 are integers from $c_2$ to $d_2$. For $j$ equal to 1 or 2, let $X_j$ be a random variable that represents the score on Form $j$ of a randomly selected population member, so that $X_j$ has integer values from $c_j$ to $d_j > c_j$.

To simplify discussion further, assume that, for any integer $x$ from $c_j$ to $d_j$, $X_j$ equals $x$ with probability $p_j(x) > 0$.

Let $F_j$ denote the distribution function of $X_j$, so that $F_j(x)$ is the probability that $X_j \le x$, and let the quantile function $Q_j$ be defined for $p$ in $(0,1)$ as the smallest $x$ such that $F_j(x) \ge p$. The functions $F_j$ and $Q_j$ are nondecreasing but not continuous, so that they are not readily employed in equating. Instead, equipercentile equating uses continuous random variables $A_j$ such that each $A_j$ has a positive density $g_j$ on an open interval $B_j$ that includes $[c_j, d_j]$, and the distribution function $G_j$ of $A_j$ approximates the distribution function $F_j$. Because the distribution function $G_j$ is continuous and strictly increasing, the quantile function $R_j$ of $A_j$ is determined by the equation $G_j(R_j(p)) = p$ for $p$ in $(0,1)$, so that $R_j$ is the inverse $G_j^{-1}$ of $G_j$. The advantage of $R_j$ over $Q_j$ is that $R_j$ is strictly monotone and continuous. The equating function $e_{12}$ for conversion of a score on Form 1 to a score on Form 2 is then $e_{12}(x) = R_2(G_1(x))$ for $x$ in $B_1$, while the equating function $e_{21}$ for conversion of a score on Form 2 to a score on Form 1 is $e_{21}(x) = R_1(G_2(x))$ for $x$ in $B_2$. Both $e_{12}$ and $e_{21}$ are strictly increasing and continuous on their respective domains, and $e_{12}$ and $e_{21}$ are inverses, so that $e_{12}(e_{21}(x)) = x$ for $x$ in $B_2$ and $e_{21}(e_{12}(x)) = x$ for $x$ in $B_1$. If $g_1$ is continuous at $x$ in $B_1$ and $g_2$ is continuous at $e_{12}(x)$, then application of standard results from calculus shows that the following results hold:

1. At $x$, the distribution function $G_1$ is continuously differentiable and has derivative $g_1(x)$.

2. At $p = G_1(x)$, $R_2$ is continuously differentiable and has derivative $1/g_2(e_{12}(x))$.

3. At $x$, $e_{12} = R_2(G_1)$ has derivative $e_{12}'(x) = g_1(x)/g_2(e_{12}(x))$.

Similarly, if $g_2$ is continuous at $x$ in $B_2$ and $g_1$ is continuous at $e_{21}(x)$, then $e_{21}$ has derivative $e_{21}'(x) = g_2(x)/g_1(e_{21}(x))$ at $x$.
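The composition that defines the conversions, and the derivative formula in the third result, can be checked numerically. The sketch below is only an illustration: it assumes two hypothetical normal distributions in place of the continuous approximations $A_1$ and $A_2$ (the means and standard deviations are invented for the example) and relies on Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

# Hypothetical continuous approximations (assumed for illustration only).
A1 = NormalDist(mu=10.0, sigma=3.0)   # plays the role of G_1 and g_1
A2 = NormalDist(mu=11.0, sigma=4.0)   # plays the role of G_2 and g_2


def e12(x):
    """Convert a Form 1 score to the Form 2 scale: R_2(G_1(x))."""
    return A2.inv_cdf(A1.cdf(x))


def e21(x):
    """Convert a Form 2 score to the Form 1 scale: R_1(G_2(x))."""
    return A1.inv_cdf(A2.cdf(x))


def e12_derivative(x):
    """Derivative of e12 at x: g_1(x) / g_2(e12(x))."""
    return A1.pdf(x) / A2.pdf(e12(x))
```

For two normal approximations the conversion is linear, so the derivative is the ratio of the standard deviations at every point; applying `e21` to the output of `e12` returns the original score up to rounding.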

1.1 The Percentile-Rank Method 

In the percentile-rank method, the distributions of $X_1$ and $X_2$ are approximated with the aid of uniformly distributed random variables $U_1$ and $U_2$ such that $U_1$ and $X_1$ are independent and $U_2$ and $X_2$ are independent. The variables $U_1$ and $U_2$ have range $(-1/2, 1/2)$. The approximating variable $A_j$ associated with $X_j$ is $X_j + U_j$. If, for real $x$, $[x]$ is the largest integer not greater than $x$, then a density $g_j$ of $A_j$ may be defined so that $g_j(x) = p_j([x + 1/2])$ for real $x$ in $B_j = (c_j - 1/2, d_j + 1/2)$. For $x$ in $B_j$,

$$G_j(x) = ([x + 1/2] + 1/2 - x)F_j([x - 1/2]) + (x + 1/2 - [x + 1/2])F_j([x + 1/2]). \tag{1}$$

The functions $e_{21}$ and $e_{12}$ are not differentiable at all points in typical situations, for, in typical cases, $g_j$ is not continuous at the points $k + 1/2$ for integers $k$ from $c_j$ to $d_j - 1$. One added limitation of the percentile rank method is that the expected value $E(A_j)$ of $A_j$ is the same as the expected value $E(X_j)$ of $X_j$, but the variance $\sigma^2(A_j)$ of $A_j$ is $\sigma^2(X_j) + 1/12$, a value always greater than the variance $\sigma^2(X_j)$ of $X_j$.
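A short sketch may make (1) concrete. It assumes the score probabilities are supplied as a list `probs` with `probs[i]` equal to $p_j(c_j + i)$; the function name and argument layout are choices made for this illustration.

```python
import math


def percentile_rank_cdf(x, c, probs):
    """G_j(x) from (1) for c - 1/2 < x < d + 1/2, where probs[i] = p_j(c + i)."""
    d = c + len(probs) - 1
    if not (c - 0.5 < x < d + 0.5):
        raise ValueError("x must lie in the open interval (c - 1/2, d + 1/2)")

    def F(k):
        # Discrete distribution function F_j(k) = P(X_j <= k).
        if k < c:
            return 0.0
        return sum(probs[: min(k, d) - c + 1])

    upper = math.floor(x + 0.5)   # [x + 1/2]
    lower = math.floor(x - 0.5)   # [x - 1/2]
    return (upper + 0.5 - x) * F(lower) + (x + 0.5 - upper) * F(upper)
```

Between consecutive half-integers the function is linear, with slope equal to the probability of the enclosed integer score, which is why the resulting conversions have kinks.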

1.2 General Kernel Equating 

In general kernel equating, $A_j$ is constructed so that $E(A_j) = E(X_j)$ and $\sigma^2(A_j) = \sigma^2(X_j)$. Consider continuous independent random variables $W_j$ with common mean 0 and with respective finite variances $\sigma^2(W_j) > 0$. In typical cases, $W_j$ has a normal distribution, but $W_j$ may have a logistic or uniform distribution (Lee & von Davier, in press). As in the percentile rank method, assume that $W_j$ and $X_j$ are independent, assume that each $W_j$ has positive density $w_j$ on a nonempty open interval $\mathcal{C}_j$ that includes $(-1/2, 1/2)$, and assume that the $W_j$ are independent of $X_1$ and $X_2$. Then the sum $S_j = X_j + W_j$ has a continuous density

$$g_{Sj}(s) = E(w_j(s - X_j)) = \sum_{x=c_j}^{d_j} p_j(x)w_j(s - x) \tag{2}$$

that is positive on an open interval that includes $(c_j - 1/2, d_j + 1/2)$. The variable $S_j$ is the same as the variable $A_j$ in section 1.1 if $W_j = U_j$. The expected value of $S_j$ is $E(S_j) = E(X_j)$, but the variance $\sigma^2(S_j) = \sigma^2(X_j) + \sigma^2(W_j)$ exceeds $\sigma^2(X_j)$.

A linear transformation of the variable $S_j$ is used in kernel equating to provide a new continuous random variable with the same mean and variance as the original variable $X_j$. Let $C_j = \sigma(X_j)/\sigma(S_j)$, and let $A_j = E(X_j) + C_j[S_j - E(X_j)]$. Then the expectation $E(A_j) = E(X_j)$, and the variance $\sigma^2(A_j) = \sigma^2(X_j)$. Standard rules for calculation of a density under a linear transformation and (2) imply that the random variable $A_j$ is continuous with a density

$$g_{Wj}(x) = C_j^{-1}g_{Sj}(E(X_j) + C_j^{-1}[x - E(X_j)]) = C_j^{-1}\sum_{t=c_j}^{d_j} p_j(t)w_j(C_j^{-1}[x - E(X_j)] - [t - E(X_j)]) \tag{3}$$

that is positive for all $x$ such that $x = E(X_j) + C_j[t - E(X_j) + y]$ for some integer $t$ from $c_j$ to $d_j$ and some $y$ in the interval $\mathcal{C}_j$ on which $w_j$ is positive. The requirement that $g_{Wj}$ be positive on an open interval that includes $[c_j, d_j]$ is certainly satisfied if the density $w_j$ is positive on the real line $R$, so that $\mathcal{C}_j = R$. If $W_j$ has cumulative distribution function $H_j$, then the distribution function of $A_j$ is

$$G_{Wj}(x) = \sum_{t=c_j}^{d_j} p_j(t)H_j(C_j^{-1}[x - E(X_j)] - [t - E(X_j)]). \tag{4}$$

If the $w_j$ are continuous and the densities $g_{Wj}$ are positive on $[c_j, d_j]$, then the conversion functions $e_{12}$ and $e_{21}$ are differentiable.

The limitation still remains that $A_j$ and $X_j$ need not have any common moments of order greater than 2. To be sure, if $W_0$ is a random variable with mean 0 and all finite moments and if $W_j = h_jW_0$ for positive real $h_j$, then, as $h_j$ approaches 0, the moments of $A_j$ converge to the moments of $X_j$; however, this convergence is achieved at a significant cost. As $h_j$ approaches 0, $g_{Wj}(x)$ approaches 0 for $x$ not an integer in $[c_j, d_j]$, and $g_{Wj}(x)$ approaches $\infty$ for $x$ an integer in $[c_j, d_j]$. In typical cases, the derivatives of $e_{12}$ and $e_{21}$ will become very large at some points.
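The construction in (3) and (4) can be sketched for the common case of a normal kernel. The code below assumes $W_j$ is normal with mean 0 and standard deviation `h`; the function name, the argument layout, and the use of `statistics.NormalDist` are choices made for this illustration rather than part of the method as published.

```python
import math
from statistics import NormalDist


def kernel_continuization(c, probs, h):
    """Return (pdf, cdf) of A_j for a N(0, h^2) kernel W_j, following (3) and (4).

    probs[i] is the probability p_j(c + i) of score c + i.
    """
    scores = range(c, c + len(probs))
    mean = sum(p * x for p, x in zip(probs, scores))
    var = sum(p * (x - mean) ** 2 for p, x in zip(probs, scores))
    shrink = math.sqrt(var / (var + h * h))      # C_j = sigma(X_j) / sigma(S_j)
    kernel = NormalDist(0.0, h)

    def arg(x, t):
        # Argument C_j^{-1}[x - E(X_j)] - [t - E(X_j)] shared by (3) and (4).
        return (x - mean) / shrink - (t - mean)

    def pdf(x):
        return sum(p * kernel.pdf(arg(x, t)) for p, t in zip(probs, scores)) / shrink

    def cdf(x):
        return sum(p * kernel.cdf(arg(x, t)) for p, t in zip(probs, scores))

    return pdf, cdf
```

Integrating `pdf` numerically over a wide grid reproduces the mean and variance of the discrete scores, which is the purpose of the factor $C_j$. Moments of order greater than 2 are not constrained in the same way.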
1.3 Continuous Exponential Families

A much more complete solution to the problem of matching moments may be achieved by use of continuous exponential families. Let $q_j$ and $r_j$ be real numbers such that $q_j < c_j$ and $r_j > d_j$. Let $u_{kj}(x)$, $k \ge 0$, be a polynomial of degree $k$, so that real constants $h_{kk'j}$, $0 \le k' \le k$, $k \ge 0$, exist such that

$$x^k = \sum_{k'=0}^{k} h_{kk'j}u_{k'j}(x).$$

Let $\mu_{kj}(X_j)$ be the expectation $E(u_{kj}(X_j))$ of $u_{kj}(X_j)$ for $k \ge 0$. Then the $k$th moment $E(X_j^k)$ of $X_j$ satisfies

$$E(X_j^k) = \sum_{k'=0}^{k} h_{kk'j}\mu_{k'j}(X_j).$$

Let $K$ be some positive integer, and let $\mathbf{u}_{Kj}(x)$ be the $K$-dimensional vector with coordinates $u_{kj}(x)$, $1 \le k \le K$. Let $\boldsymbol{\mu}_{Kj}(X_j)$ be the $K$-dimensional vector with coordinates $\mu_{kj}(X_j)$ for $1 \le k \le K$. For $K$-dimensional vectors $\mathbf{x}$ and $\mathbf{y}$ with respective coordinates $x_k$ and $y_k$, $1 \le k \le K$, let $\mathbf{x}'\mathbf{y}$ be the summation $\sum_{k=1}^{K} x_ky_k$. For any $K$-dimensional vector $\boldsymbol{\theta}$ with coordinates $\theta_k$, $1 \le k \le K$, a density $g_{Kj}(\cdot, \boldsymbol{\theta})$ may be defined for $x$ in $[q_j, r_j]$ so that

$$g_{Kj}(x, \boldsymbol{\theta}) = \gamma_{Kj}(\boldsymbol{\theta})\exp[\boldsymbol{\theta}'\mathbf{u}_{Kj}(x)], \tag{5}$$

where

$$[\gamma_{Kj}(\boldsymbol{\theta})]^{-1} = \int_{q_j}^{r_j} \exp[\boldsymbol{\theta}'\mathbf{u}_{Kj}(x)]\,dx. \tag{6}$$

In the case of $K = 1$, (5) implies that $g_{Kj}(\cdot, \boldsymbol{\theta})$ is the conditional density of an exponential random variable given that the variable has value between $q_j$ and $r_j$. For $K = 2$, $g_{Kj}(\cdot, \boldsymbol{\theta})$ is the conditional density of a normal random variable given that the variable has value between $q_j$ and $r_j$.

As in Gilula and Haberman (2000), the quality of the approximation provided by the density $g_{Kj}(\cdot, \boldsymbol{\theta})$ in (5) may be assessed by use of the expected logarithmic penalty

$$H_{Kj}(\boldsymbol{\theta}) = -E(\log g_{Kj}(X_j, \boldsymbol{\theta})) = -\log\gamma_{Kj}(\boldsymbol{\theta}) - \boldsymbol{\theta}'\boldsymbol{\mu}_{Kj}(X_j). \tag{7}$$

The smaller the value of $H_{Kj}(\boldsymbol{\theta})$, the better is the approximation.
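The density in (5), its normalizing constant in (6), and the penalty in (7) can be evaluated with a few lines of code. The sketch below makes several assumptions that go beyond the text at this point: the polynomials $u_{kj}$ are taken to be Legendre polynomials rescaled to $[q_j, r_j]$, the integral in (6) is approximated by a composite Simpson rule, and all function names are hypothetical.

```python
import math


def legendre_basis(x, q, r, K):
    """u_k(x) = P_k((2x - q - r)/(r - q)) for k = 1..K via the three-term recurrence."""
    z = (2.0 * x - q - r) / (r - q)
    values = [1.0, z]
    for k in range(1, K):
        values.append(((2 * k + 1) * z * values[k] - k * values[k - 1]) / (k + 1))
    return values[1 : K + 1]


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def make_density(theta, q, r):
    """Return g(x, theta) = gamma(theta) * exp(theta' u(x)) on [q, r], as in (5) and (6)."""
    K = len(theta)

    def kernel(x):
        u = legendre_basis(x, q, r, K)
        return math.exp(sum(t * uk for t, uk in zip(theta, u)))

    gamma = 1.0 / simpson(kernel, q, r)
    return lambda x: gamma * kernel(x)


def expected_log_penalty(theta, q, r, c, probs):
    """H(theta) = -E log g(X, theta) for discrete X with P(X = c + i) = probs[i], as in (7)."""
    g = make_density(theta, q, r)
    return -sum(p * math.log(g(c + i)) for i, p in enumerate(probs))
```

With all coordinates of `theta` equal to 0 the density is uniform on $[q_j, r_j]$, so the penalty reduces to $\log(r_j - q_j)$ for any score distribution, which gives a simple baseline against which fitted models can be compared.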

Several rationales can be considered for use of the expected logarithmic penalty $H_{Kj}(\boldsymbol{\theta})$ (Gilula & Haberman, 2000). If $Y$ is a continuous real variable with density $f$, if $g$ is also a probability density function, and if a penalty of $-\log g(y)$ is recorded if $Y = y$, then the smallest expected log penalty $E(-\log g(Y))$ is obtained only if $g(Y) = f(Y)$ with probability 1. This feature, in which the penalty is determined by the value of the density at the observed value of $Y$ and the expected penalty is minimized by selection of the actual density, is only encountered if the penalty is of the form $a - b\log g(y)$ for $Y = y$ for some real constants $a$ and $b$ such that $b > 0$.

This rationale is not applicable to the discrete variables $X_j$. In general, if $Y$ is discrete, then the smallest possible expected log penalty $E(-\log g(Y))$ is $-\infty$, for, given any real $c > 0$, $g$ can be chosen so that $g(Y) = c$ with probability 1 and the expected log penalty is $-\log c$. The constant $c$ may be arbitrarily large, so that the expected log penalty may be arbitrarily small. Nonetheless, the criterion $E(-\log g(Y))$ for a density $g$ cannot be made arbitrarily small if adequate constraints are imposed on $g$. In this section, the requirement that the density function used for prediction of $X_j$ belongs to a continuous exponential family suffices to ensure that, in (7), there is a finite infimum $I_{Kj}$ of the expected log penalty $H_{Kj}(\boldsymbol{\theta})$ over all $\boldsymbol{\theta}$.

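A unique $K$-dimensional vector $\boldsymbol{\theta}_{Kj}$ with coordinates $\theta_{Kkj}$, $1 \le k \le K$, exists such that $H_{Kj}(\boldsymbol{\theta}_{Kj}) = I_{Kj}$. This vector $\boldsymbol{\theta}_{Kj}$ is the unique solution of the equations

$$\nu_{kj}(\boldsymbol{\theta}_K) = \int_{q_j}^{r_j} u_{kj}(x)g_{Kj}(x, \boldsymbol{\theta}_K)\,dx = \mu_{kj}(X_j), \quad 1 \le k \le K. \tag{8}$$

If $\boldsymbol{\nu}_{Kj}(\boldsymbol{\theta}_K)$ is the $K$-dimensional vector with coordinates $\nu_{kj}(\boldsymbol{\theta}_K)$ for $1 \le k \le K$ and $\boldsymbol{\mu}_{Kj}(X_j)$ is the $K$-dimensional vector with coordinates $\mu_{kj}(X_j)$ for $1 \le k \le K$, then $\boldsymbol{\nu}_{Kj}(\boldsymbol{\theta}_{Kj}) = \boldsymbol{\mu}_{Kj}(X_j)$. Equivalently, if $V_{Kj}$ is a random variable with range $[q_j, r_j]$ with density $g_{Kj}(\cdot, \boldsymbol{\theta}_{Kj})$, then $\mu_{kj}(V_{Kj}) = \mu_{kj}(X_j)$ for all integers $k$ from 1 to $K$, so that $E(V_{Kj}^k) = E(X_j^k)$ for $1 \le k \le K$. If $K \ge 1$, then $E(V_{Kj}) = E(X_j)$. If $K \ge 2$, then $\sigma^2(V_{Kj}) = \sigma^2(X_j)$. If $K \ge 3$, then $V_{Kj}$ and $X_j$ have the same skewness coefficient. If $K \ge 4$, then $V_{Kj}$ and $X_j$ have the same coefficient of kurtosis. By (7) and (8), the minimum expected penalty is

$$I_{Kj} = -\log\gamma_{Kj}(\boldsymbol{\theta}_{Kj}) - \boldsymbol{\theta}_{Kj}'\boldsymbol{\mu}_{Kj}(X_j). \tag{9}$$

Corresponding to the density $g_{Kj}(\cdot, \boldsymbol{\theta}_{Kj})$ in (5) is the cumulative distribution function

$$G_{Kj}(x) = \int_{q_j}^{x} g_{Kj}(v, \boldsymbol{\theta}_{Kj})\,dv \tag{10}$$

for $x$ between $q_j$ and $r_j$. One then has an inverse $R_{Kj}$ such that $G_{Kj}(R_{Kj}(p)) = p$ for $p$ in $(0,1)$. In equating, a positive integer $K_j$ is selected for each $j$. Then $e_{12}(x) = R_{K_22}(G_{K_11}(x))$ for $x$ in $(q_1, r_1)$ and $e_{21}(x) = R_{K_11}(G_{K_22}(x))$ for $x$ in $(q_2, r_2)$.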
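Given a density on $[q_j, r_j]$, the distribution function in (10), its inverse, and the resulting conversion can be computed numerically. The following sketch is one possible arrangement, not a description of the computations used for this report: it assumes a composite Simpson rule for the integral and bisection for the inverse, and it takes the logarithm of the unnormalized density as input. The function names are hypothetical.

```python
import math


def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def make_cdf(log_kernel, q, r):
    """Normalize exp(log_kernel) on [q, r] and return its distribution function G."""
    f = lambda x: math.exp(log_kernel(x))
    total = simpson(f, q, r)
    return lambda x: simpson(f, q, x) / total


def quantile(G, p, q, r, tol=1e-10):
    """Solve G(x) = p on (q, r) by bisection; G is continuous and strictly increasing."""
    lo, hi = q, r
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def equate(x, G_from, G_to, q_to, r_to):
    """Conversion R_to(G_from(x)) of a score x to the scale of the other form."""
    return quantile(G_to, G_from(x), q_to, r_to)
```

For $K = 2$ the log kernel is a quadratic in $x$, so a call such as `make_cdf(lambda x: -0.5 * ((x - 10.0) / 3.0) ** 2, -0.5, 20.5)` produces the distribution function of a normal variable conditioned to lie in $[-0.5, 20.5]$; the center 10 and scale 3 here are invented values.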

In section 2, estimation of $\boldsymbol{\theta}_{Kj}$, $I_{Kj}$, $G_{Kj}$, and $R_{Kj}$ is considered for the case of simple random sampling. Estimates, large sample approximations for distributions of estimates, and estimated asymptotic standard deviations are all provided.

In section 3, some examples of estimation are provided for some distributions of test scores reported in von Davier et al. (2004). In section 4, conclusions are reached concerning the status of continuous exponential families in equating.

In sections 2, 3, and 4, comparisons with alternative equating methods are considered. For this purpose, some consideration of expected log penalty for percentile-rank and kernel methods is provided in section 1.4.

1.4 Comparisons by Expected Log Penalty 

The proposed approximations may be compared to those from percentile-rank or kernel equating. In the percentile-rank case, the expected log penalty

$$I_{Pj} = -E(\log p_j(X_j)) = -\sum_{x=c_j}^{d_j} p_j(x)\log p_j(x) \tag{11}$$

is the entropy of the discrete variable $X_j$. In the percentile-rank case with log-linear smoothing of order $K \le d_j - c_j$, the probabilities $p_j(x)$ are approximated by probabilities $p_{Kj}(x)$ defined so that $\log p_{Kj}(x)$ is a polynomial in $x$ of order $K$ and the expected penalty

$$I_{PKj} = -E(\log p_{Kj}(X_j)) = -\sum_{x=c_j}^{d_j} p_j(x)\log p_{Kj}(x) \ge I_{Pj} \tag{12}$$

is minimized subject to this constraint on $p_{Kj}(x)$. One has

$$p_{Kj}(x) = \eta_{Kj}(\boldsymbol{\omega}_{Kj})\exp[\boldsymbol{\omega}_{Kj}'\mathbf{u}_{Kj}(x)], \tag{13}$$

$$[\eta_{Kj}(\boldsymbol{\omega}_{Kj})]^{-1} = \sum_{x=c_j}^{d_j} \exp[\boldsymbol{\omega}_{Kj}'\mathbf{u}_{Kj}(x)], \tag{14}$$

and

$$\sum_{x=c_j}^{d_j} u_{kj}(x)p_{Kj}(x) = \mu_{kj}(X_j), \quad 1 \le k \le K. \tag{15}$$

Because a polynomial of degree $d_j - c_j$ can be found to fit any real function at $d_j - c_j + 1$ points, if $K = d_j - c_j$, then $I_{PKj} = I_{Pj}$ and $p_{Kj}(x) = p_j(x)$. In general, the equation $I_{PKj} = I_{Pj}$ holds if, and only if, $\log p_j(x)$ is a polynomial of order $K$ in terms of $x$, so that the smoothed probability $p_{Kj}(x)$ is equal to the actual probability.
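The inequality in (12) holds for any approximating probabilities, not only for log-linear ones, and it is easy to check numerically. The sketch below assumes the probabilities are held in plain lists; both function names are hypothetical.

```python
import math


def expected_log_penalty(probs, approx):
    """-sum_x p(x) log approx(x): expected log penalty of approx under probs."""
    return -sum(p * math.log(a) for p, a in zip(probs, approx) if p > 0)


def entropy(probs):
    """I_P = -sum_x p(x) log p(x), the penalty in (11), with 0 log 0 taken as 0."""
    return expected_log_penalty(probs, probs)
```

For example, replacing the actual probabilities by a uniform distribution over the same scores gives a penalty equal to the logarithm of the number of scores, which can never fall below the entropy.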

In kernel equating, (3) implies that the expected log penalty is

$$I_{Wj} = -E(\log g_{Wj}(X_j)) = -\sum_{x=c_j}^{d_j} p_j(x)\log g_{Wj}(x). \tag{16}$$

With log-linear smoothing of order $K \ge 2$, the expected log penalty is

$$I_{WKj} = -\sum_{x=c_j}^{d_j} p_{Kj}(x)\log g_{WKj}(x), \tag{17}$$

where

$$g_{WKj}(t) = C_j^{-1}\sum_{x=c_j}^{d_j} p_{Kj}(x)w_j(C_j^{-1}[t - E(X_j)] - [x - E(X_j)]). \tag{18}$$

The expected log penalty can be made arbitrarily small by selection of $W_j = h_jW_0$, where $h_j$ is positive, $W_0$ is a continuous random variable with positive density $w_0$, and $W_0$ is independent of $X_1$ and $X_2$. Because $w_j(t) = w_0(t/h_j)/h_j$, it follows that, as $h_j$ approaches 0, $C_j$ approaches 1 and $h_jg_{WKj}(x)$ has a lower limit at least equal to $p_{Kj}(x)w_0(0)$ for each integer $x$ from $c_j$ to $d_j$. Thus $I_{WKj}$ approaches $-\infty$. As evident from the formulas in the introduction for conversions, a very large value of $g_{WKj}$ has the danger that the derivative of a conversion will be very large and the conversion will be unstable. This issue is customarily treated in kernel equating (von Davier et al., 2004, pp. 62-64). One relatively simple approach based on the expected log penalty is to require that $g_{WKj}(t)$ have no more than $K - 1$ points at which its derivative changes sign, just as $g_{Kj}(t)$ has a derivative that changes sign no more than $K - 1$ times.


2 Estimation of Parameters Under Random Sampling 

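Data from random sampling are readily applied to estimation of the parameters $\boldsymbol{\theta}_{Kj}$ for $K \ge 1$ (Gilula & Haberman, 2000). Recall definitions in section 1.3. For $j$ equal to 1 or 2, let $X_{ij}$, $1 \le i \le n_j$, be independent and identically distributed random variables with the same distribution as $X_j$. Let $m_{kj}(X_j)$ be the sample mean

$$m_{kj}(X_j) = n_j^{-1}\sum_{i=1}^{n_j} u_{kj}(X_{ij}) \tag{19}$$

for $k \ge 1$, and let $\mathbf{m}_{Kj}(X_j)$ be the $K$-dimensional vector with coordinates $m_{kj}(X_j)$ for $1 \le k \le K$. If the $X_{ij}$, $1 \le i \le n_j$, have at least $K$ distinct values, then $\boldsymbol{\theta}_{Kj}$ is estimated by the $K$-dimensional vector $\hat{\boldsymbol{\theta}}_{Kj}$ with coordinates $\hat\theta_{Kkj}$, $1 \le k \le K$, where $\hat{\boldsymbol{\theta}}_{Kj}$ is the unique $K$-dimensional vector such that

$$\nu_{kj}(\hat{\boldsymbol{\theta}}_{Kj}) = m_{kj}(X_j), \quad 1 \le k \le K. \tag{20}$$

Thus (20) corresponds to (8). As the sample size $n_j$ approaches $\infty$, $\hat{\boldsymbol{\theta}}_{Kj}$ converges to $\boldsymbol{\theta}_{Kj}$ with probability 1, and $n_j^{1/2}(\hat{\boldsymbol{\theta}}_{Kj} - \boldsymbol{\theta}_{Kj})$ converges in distribution to a multivariate normal random variable with zero mean and with covariance matrix $\mathbf{B}_{Kj} = \mathbf{C}_{Kj}^{-1}\mathbf{D}_{Kj}\mathbf{C}_{Kj}^{-1}$. Here $\mathbf{D}_{Kj}$ is the covariance matrix of $\mathbf{u}_{Kj}(X_j)$ and $\mathbf{C}_{Kj}$ is the covariance matrix of the $K$-dimensional vector $\mathbf{u}_{Kj}(V_{Kj})$. Thus $\mathbf{C}_{Kj} = \mathbf{U}_{Kj}(\boldsymbol{\theta}_{Kj})$, where row $k$ and column $k'$ of $\mathbf{U}_{Kj}(\boldsymbol{\theta}_{Kj})$ is

$$U_{Kkk'j}(\boldsymbol{\theta}_{Kj}) = \int_{q_j}^{r_j} u_{kj}(x)u_{k'j}(x)g_{Kj}(x, \boldsymbol{\theta}_{Kj})\,dx - \nu_{kj}(\boldsymbol{\theta}_{Kj})\nu_{k'j}(\boldsymbol{\theta}_{Kj}). \tag{21}$$

One may estimate $\mathbf{C}_{Kj}$ by $\hat{\mathbf{C}}_{Kj} = \mathbf{U}_{Kj}(\hat{\boldsymbol{\theta}}_{Kj})$, and $\mathbf{D}_{Kj}$ may be estimated by the sample covariance matrix $\hat{\mathbf{D}}_{Kj}$ of the $\mathbf{u}_{Kj}(X_{ij})$. Thus $\mathbf{B}_{Kj}$ is estimated by $\hat{\mathbf{B}}_{Kj} = \hat{\mathbf{C}}_{Kj}^{-1}\hat{\mathbf{D}}_{Kj}\hat{\mathbf{C}}_{Kj}^{-1}$. The estimated asymptotic standard deviation (EASD) of $\hat\theta_{Kkj}$ is $\hat\sigma(\hat\theta_{Kkj}) = (n_j^{-1}\hat B_{Kkkj})^{1/2}$, where $\hat B_{Kkkj}$ is row $k$ and column $k$ of $\hat{\mathbf{B}}_{Kj}$.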

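The minimum expected penalty $I_{Kj}$ in (9) may be estimated by

$$\hat I_{Kj} = -\log\gamma_{Kj}(\hat{\boldsymbol{\theta}}_{Kj}) - \hat{\boldsymbol{\theta}}_{Kj}'\mathbf{m}_{Kj}(X_j). \tag{22}$$

The estimate $\hat I_{Kj}$ of (22) has the standard stability property that, as the sample size $n_j$ increases, $\hat I_{Kj}$ converges to $I_{Kj}$ with probability 1 and $n_j^{1/2}(\hat I_{Kj} - I_{Kj})$ converges in distribution to a normal random variable with mean 0 and variance

$$\sigma^2(-\log g_{Kj}(X_j, \boldsymbol{\theta}_{Kj})) = \boldsymbol{\theta}_{Kj}'\mathbf{D}_{Kj}\boldsymbol{\theta}_{Kj}.$$

Let $\bar p_j(x)$ be the fraction of observations $i$ from 1 to $n_j$ with $X_{ij} = x$, and let $0\log 0$ be 0. The EASD of $\hat I_{Kj}$ is then

$$\hat\sigma(\hat I_{Kj}) = (n_j^{-1}\hat{\boldsymbol{\theta}}_{Kj}'\hat{\mathbf{D}}_{Kj}\hat{\boldsymbol{\theta}}_{Kj})^{1/2}. \tag{23}$$

Equivalently, (23) can be written in terms of the density $g_{Kj}$ of (5). One has

$$\hat\sigma(\hat I_{Kj}) = \Big\{n_j^{-1}\sum_{x=c_j}^{d_j} \bar p_j(x)[-\log g_{Kj}(x, \hat{\boldsymbol{\theta}}_{Kj}) - \hat I_{Kj}]^2\Big\}^{1/2}. \tag{24}$$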



2.1 Alternative Estimation Methods and Information 

In comparisons with the alternative methods of sections 1.1, 1.2, and 1.4, the expected log penalty $I_{Pj}$ for the percentile-rank case defined by (11) is estimated by

$$\hat I_{Pj} = -\sum_{x=c_j}^{d_j} \bar p_j(x)\log\bar p_j(x), \tag{25}$$

and the EASD of $\hat I_{Pj}$ is

$$\hat\sigma(\hat I_{Pj}) = \Big\{n_j^{-1}\sum_{x=c_j}^{d_j} \bar p_j(x)[-\log\bar p_j(x) - \hat I_{Pj}]^2\Big\}^{1/2}. \tag{26}$$

In the percentile-rank case with log-linear smoothing of order $K$, the estimated expected log penalty that corresponds to $I_{PKj}$ in (12) is

$$\hat I_{PKj} = -\sum_{x=c_j}^{d_j} \bar p_j(x)\log\hat p_{Kj}(x), \tag{27}$$

where (13), (14), and (15) lead to

$$\hat p_{Kj}(x) = \eta_{Kj}(\hat{\boldsymbol{\omega}}_{Kj})\exp[\hat{\boldsymbol{\omega}}_{Kj}'\mathbf{u}_{Kj}(x)]$$

and

$$\sum_{x=c_j}^{d_j} u_{kj}(x)\hat p_{Kj}(x) = m_{kj}(X_j), \quad 1 \le k \le K. \tag{28}$$

The EASD of $\hat I_{PKj}$ is

$$\hat\sigma(\hat I_{PKj}) = \Big\{n_j^{-1}\sum_{x=c_j}^{d_j} \bar p_j(x)[-\log\hat p_{Kj}(x) - \hat I_{PKj}]^2\Big\}^{1/2}. \tag{29}$$

In kernel equating with a fixed choice of $W_j$, $I_{Wj}$ in (16) may be estimated by

$$\hat I_{Wj} = -\sum_{x=c_j}^{d_j} \bar p_j(x)\log\hat g_{Wj}(x), \tag{30}$$

where the $g_{Wj}$ of (3) is estimated by

$$\hat g_{Wj}(t) = \hat C_j^{-1}\sum_{x=c_j}^{d_j} \bar p_j(x)w_j(\hat C_j^{-1}[t - \bar X_j] - [x - \bar X_j]), \tag{31}$$

$\bar X_j$ is the sample mean of the $X_{ij}$, $1 \le i \le n_j$, $\hat\sigma(X_j)$ is the sample standard deviation of the $X_{ij}$, and $C_j$ is estimated by

$$\hat C_j = \hat\sigma(X_j)/[\hat\sigma^2(X_j) + \sigma^2(W_j)]^{1/2}.$$

The formula for the EASD of $\hat I_{Wj}$ is somewhat more complicated than in other cases, so it is omitted.

With log-linear smoothing of order $K \ge 2$, the expected log penalty $I_{WKj}$ of (17) is estimated by

$$\hat I_{WKj} = -\sum_{x=c_j}^{d_j} \hat p_{Kj}(x)\log\hat g_{WKj}(x), \tag{32}$$

where $g_{WKj}$ in (18) is estimated by

$$\hat g_{WKj}(t) = \hat C_j^{-1}\sum_{x=c_j}^{d_j} \hat p_{Kj}(x)w_j(\hat C_j^{-1}[t - \bar X_j] - [x - \bar X_j]). \tag{33}$$

The EASD of the estimate $\hat I_{WKj}$ in (32) is a bit more complex than in the case of the estimate $\hat I_{Wj}$ in (30), so this formula is also omitted.


2.2 Equating Functions for Continuous Exponential Families 

The distribution function $G_{Kj}$ in (10) for the continuous exponential family has estimate $\hat G_{Kj}$ defined by

$$\hat G_{Kj}(x) = \int_{q_j}^{x} g_{Kj}(v, \hat{\boldsymbol{\theta}}_{Kj})\,dv \tag{34}$$

for $q_j < x < r_j$, and the quantile function $R_{Kj}$ corresponding to $G_{Kj}$ has estimate $\hat R_{Kj}$ defined by $\hat G_{Kj}(\hat R_{Kj}(p)) = p$ for $0 < p < 1$. Standard large-sample arguments imply that, as the sample size $n_j$ approaches $\infty$, $\hat G_{Kj}(x)$ converges to $G_{Kj}(x)$ with probability 1 for $q_j < x < r_j$, so that $|\hat G_{Kj} - G_{Kj}|$, the supremum of $|\hat G_{Kj}(x) - G_{Kj}(x)|$ for $q_j < x < r_j$, converges to 0 with probability 1. In addition, $[\hat G_{Kj}(x) - G_{Kj}(x)]/\sigma(\hat G_{Kj}(x))$ converges in distribution to a normal random variable with mean 0 and variance 1 if the asymptotic standard deviation of $\hat G_{Kj}(x)$ is

$$\sigma(\hat G_{Kj}(x)) = \{n_j^{-1}[\mathbf{T}_{Kj}(x)]'\mathbf{B}_{Kj}\mathbf{T}_{Kj}(x)\}^{1/2} \tag{35}$$

and if

$$\mathbf{T}_{Kj}(x) = \int_{q_j}^{x} [\mathbf{u}_{Kj}(v) - \boldsymbol{\mu}_{Kj}(X_j)]g_{Kj}(v, \boldsymbol{\theta}_{Kj})\,dv. \tag{36}$$

Similarly, the estimated quantile function $\hat R_{Kj}(p)$ corresponding to the distribution function $\hat G_{Kj}$ in (34) converges to the quantile function $R_{Kj}(p)$ with probability 1, and $[\hat R_{Kj}(p) - R_{Kj}(p)]/\sigma(\hat R_{Kj}(p))$ converges in distribution to a normal random variable with mean 0 and variance 1 if the asymptotic standard deviation of $\hat R_{Kj}(p)$ is

$$\sigma(\hat R_{Kj}(p)) = [g_{Kj}(R_{Kj}(p), \boldsymbol{\theta}_{Kj})]^{-1}\sigma(\hat G_{Kj}(R_{Kj}(p))). \tag{37}$$

Estimated asymptotic standard deviations may be derived by use of obvious substitutions of estimated parameters for actual parameters. Thus (35) for the asymptotic standard deviation of $\hat G_{Kj}(x)$ leads to the estimated asymptotic standard deviation

$$\hat\sigma(\hat G_{Kj}(x)) = \{n_j^{-1}[\hat{\mathbf{T}}_{Kj}(x)]'\hat{\mathbf{B}}_{Kj}\hat{\mathbf{T}}_{Kj}(x)\}^{1/2}, \tag{38}$$

where the vector $\mathbf{T}_{Kj}(x)$ of (36) is estimated by

$$\hat{\mathbf{T}}_{Kj}(x) = \int_{q_j}^{x} [\mathbf{u}_{Kj}(v) - \mathbf{m}_{Kj}(X_j)]g_{Kj}(v, \hat{\boldsymbol{\theta}}_{Kj})\,dv, \tag{39}$$

and (37) for the asymptotic standard deviation of $\hat R_{Kj}(p)$ leads to the estimated asymptotic standard deviation

$$\hat\sigma(\hat R_{Kj}(p)) = [g_{Kj}(\hat R_{Kj}(p), \hat{\boldsymbol{\theta}}_{Kj})]^{-1}\hat\sigma(\hat G_{Kj}(\hat R_{Kj}(p))). \tag{40}$$

In the case of equating with constants $K_1$ for $X_1$ and $K_2$ for $X_2$, the conversion function $e_{12}(x)$ for conversion of Score $x$ on Form 1 to a score on Form 2 has estimate $\hat e_{12}(x) = \hat R_{K_22}(\hat G_{K_11}(x))$ for $q_1 < x < r_1$, and the conversion function $e_{21}(x)$ for conversion of a Score $x$ on Form 2 to a score on Form 1 has estimate $\hat e_{21}(x) = \hat R_{K_11}(\hat G_{K_22}(x))$ for $q_2 < x < r_2$. As the sample sizes $n_1$ and $n_2$ become large, $\hat e_{12}(x)$ converges with probability 1 to $e_{12}(x)$, and $\hat e_{21}(x)$ converges with probability 1 to $e_{21}(x)$. In addition, $[\hat e_{12}(x) - e_{12}(x)]/\sigma(\hat e_{12}(x))$ converges in distribution to a standard normal random variable if the asymptotic standard deviation of the estimated conversion $\hat e_{12}(x)$ is

$$\sigma(\hat e_{12}(x)) = [\sigma^2(\hat G_{K_22}(e_{12}(x))) + \sigma^2(\hat G_{K_11}(x))]^{1/2}/g_{K_22}(e_{12}(x), \boldsymbol{\theta}_{K_22}). \tag{41}$$

Given (41), it follows that the EASD of the estimated conversion $\hat e_{12}(x)$ is

$$\hat\sigma(\hat e_{12}(x)) = [\hat\sigma^2(\hat G_{K_22}(\hat e_{12}(x))) + \hat\sigma^2(\hat G_{K_11}(x))]^{1/2}/g_{K_22}(\hat e_{12}(x), \hat{\boldsymbol{\theta}}_{K_22}). \tag{42}$$

The case of the conversion function $e_{21}$ from Form 2 to Form 1 is treated in a similar fashion.

2.3 Other Equating Functions

The large-sample results for equating based on continuous exponential families are a bit 
simpler than those for percentile ranks or for the kernel method; however, the two cases differ 
somewhat. 

In the case of percentile ranks, for $x$ in the open interval $B_j = (c_j - 1/2, d_j + 1/2)$, the continuous distribution function $G_j$ in (1) may be estimated by

$$\hat G_j(x) = ([x + 1/2] + 1/2 - x)\hat F_j([x - 1/2]) + (x + 1/2 - [x + 1/2])\hat F_j([x + 1/2]), \tag{43}$$

where $\hat F_j$ is the empirical distribution function of the $X_{ij}$, $1 \le i \le n_j$. With probability 1, $|\hat G_j - G_j|$ converges to 0 as $n_j$ becomes large. If, for $0 < p < 1$, the estimated quantile function $\hat R_j$ corresponding to $\hat G_j$ satisfies $\hat G_j(\hat R_j(p)) = p$ and the quantile function $R_j$ corresponding to $G_j$ satisfies $G_j(R_j(p)) = p$, then the estimated quantile function $\hat R_j(p)$ converges to the quantile function $R_j(p)$ with probability 1, so that the estimated conversion function $\hat e_{12}(x) = \hat R_2(\hat G_1(x))$ for conversion of Score $x$ on Form 1 to a score on Form 2 converges to the corresponding conversion function $e_{12}(x) = R_2(G_1(x))$ for $x$ in $B_1$. Results for asymptotic normality are not entirely satisfactory, for a case with $e_{12}(x) - 1/2$ equal to an integer typically results in no normal approximation for the distribution of the estimated conversion function $\hat e_{12}(x)$. Similar issues arise for the percentile-rank method with log-linear smoothing.

Asymptotic results are available for kernel equating (von Davier et al., 2004) that are 
comparable to those for continuous exponential families. When kernel equating is applied with 
log-linear smoothing, the results are a bit more complicated than for continuous exponential 
families due to the use of both smoothing of frequencies and conversion to a continuous distribution 
by use of the kernel approach. 

2.4 Computational Issues 

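Given a starting value $\boldsymbol{\theta}_{Kj0}$, the Newton-Raphson algorithm may be employed to compute $\hat{\boldsymbol{\theta}}_{Kj}$ in (20). At step $t \ge 0$, a new approximation $\boldsymbol{\theta}_{Kj(t+1)}$ of $\hat{\boldsymbol{\theta}}_{Kj}$ is found by the equation

$$\boldsymbol{\theta}_{Kj(t+1)} = \boldsymbol{\theta}_{Kjt} + [\mathbf{U}_{Kj}(\boldsymbol{\theta}_{Kjt})]^{-1}[\mathbf{m}_{Kj}(X_j) - \boldsymbol{\nu}_{Kj}(\boldsymbol{\theta}_{Kjt})]. \tag{44}$$

Recall that the elements of the $K$ by $K$ matrix $\mathbf{U}_{Kj}$ are defined in (21), the elements of the $K$-dimensional vector $\mathbf{m}_{Kj}(X_j)$ are defined in (19), and the elements of the $K$-dimensional vector function $\boldsymbol{\nu}_{Kj}$ are defined in (8).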
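The iteration in (44) can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the results reported here: the basis is the rescaled Legendre basis, integrals are approximated by a composite Simpson rule instead of Gaussian quadrature, the starting value is the zero vector, and no step-size safeguard is included. All function names are hypothetical.

```python
import math


def basis(x, q, r, K):
    """Legendre basis u_k(x) = P_k((2x - q - r)/(r - q)), k = 1..K."""
    z = (2.0 * x - q - r) / (r - q)
    vals = [1.0, z]
    for k in range(1, K):
        vals.append(((2 * k + 1) * z * vals[k] - k * vals[k - 1]) / (k + 1))
    return vals[1 : K + 1]


def simpson_nodes(q, r, n=400):
    """Nodes and weights of the composite Simpson rule on [q, r] (n even)."""
    h = (r - q) / n
    nodes = [q + i * h for i in range(n + 1)]
    weights = [h / 3.0 * (1 if i in (0, n) else 4 if i % 2 else 2) for i in range(n + 1)]
    return nodes, weights


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_theta(targets, q, r, iterations=50, tol=1e-10):
    """Newton-Raphson iteration (44) for the moment equations (20), starting at theta = 0."""
    K = len(targets)
    nodes, weights = simpson_nodes(q, r)
    U = [basis(x, q, r, K) for x in nodes]
    theta = [0.0] * K
    for _ in range(iterations):
        dens = [w * math.exp(sum(t * u for t, u in zip(theta, ux)))
                for w, ux in zip(weights, U)]
        total = sum(dens)
        dens = [d / total for d in dens]          # quadrature weight times g(x, theta)
        nu = [sum(d * ux[k] for d, ux in zip(dens, U)) for k in range(K)]
        cov = [[sum(d * ux[k] * ux[l] for d, ux in zip(dens, U)) - nu[k] * nu[l]
                for l in range(K)] for k in range(K)]
        step = solve(cov, [m - v for m, v in zip(targets, nu)])
        theta = [t + s for t, s in zip(theta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return theta
```

In use, `targets` would hold the sample means $m_{kj}(X_j)$ of the basis functions. A production implementation would likely add step halving when an iteration fails to reduce the expected log penalty.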

In practice, numerical work is simplified if computations employ Legendre polynomials (Abramowitz & Stegun, 1965, chapters 8, 22). The Legendre polynomial of degree 0 is $P_0(x) = 1$, the Legendre polynomial of degree 1 is $P_1(x) = x$, and the Legendre polynomial $P_{k+1}(x)$ of degree $k + 1$, $k \ge 1$, is determined by the recurrence relationship

$$P_{k+1}(x) = (k + 1)^{-1}[(2k + 1)xP_k(x) - kP_{k-1}(x)], \tag{45}$$

so that $P_2(x) = (3x^2 - 1)/2$. These polynomials satisfy the relationships

$$\int_{-1}^{1} P_j(x)P_k(x)\,dx = \frac{2\delta_{jk}}{2k + 1} \tag{46}$$

for nonnegative integers $j$ and $k$, where the Kronecker delta $\delta_{jk}$ is 1 for $j = k$ and 0 otherwise. It is relatively efficient for numerical work to let $u_{kj}(x) = P_k((2x - q_j - r_j)/(r_j - q_j))$, for then the $K$ by $K$ matrix $\mathbf{U}_{Kj}(\mathbf{0}_K)$ defined in (21) is a diagonal matrix, where $\mathbf{0}_K$ is the $K$-dimensional vector with all coordinates 0. The obvious choice $u_{kj}(x) = x^k$ is avoided because this case often leads to poor conditioning of the matrix $\mathbf{U}_{Kj}(\boldsymbol{\theta})$. The Legendre polynomials also form the basis for the Gaussian quadratures required for evaluation of the integrals from $q_j$ to $r_j$ that are needed in numerical work (Abramowitz & Stegun, 1965, p. 887). In this paper, calculations use 8-point Gaussian quadrature.
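The recurrence (45) and the orthogonality relationship (46) are straightforward to verify numerically. The sketch below evaluates the polynomials by the recurrence and approximates the integral in (46) with a composite Simpson rule, which is an assumption made for simplicity; the function names are hypothetical.

```python
def legendre(k, x):
    """P_k(x) from P_0 = 1, P_1 = x and the recurrence (45)."""
    prev, cur = 1.0, x
    if k == 0:
        return prev
    for n in range(1, k):
        prev, cur = cur, ((2 * n + 1) * x * cur - n * prev) / (n + 1)
    return cur


def inner_product(j, k, n=2000):
    """Composite Simpson approximation to the integral of P_j P_k over [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n + 1):
        x = -1.0 + i * h
        w = 1 if i in (0, n) else 4 if i % 2 else 2
        total += w * legendre(j, x) * legendre(k, x)
    return total * h / 3.0
```

The diagonal values $2/(2k + 1)$ explain why the matrix in (21) is diagonal at the zero vector: with a uniform density on $[q_j, r_j]$, the rescaled basis functions are uncorrelated.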


3 Example 

Table 7.1 of von Davier et al. (2004) provides two distributions of test scores that are integers from $c_j = 0$ to $d_j = 20$. To illustrate results, the case of $q_j = -0.5$ and $r_j = 20.5$ is considered for $K$ from 2 to 4 and for the Legendre polynomial case with $u_{kj}(x) = P_k((2x - q_j - r_j)/(r_j - q_j))$ for $P_k$ defined as in (45). Results for parameters are summarized in Tables 1 and 2. Results in terms of estimated expected log penalties are summarized in Table 3. These tables suggest that gains over the quadratic case ($K = 2$) are very modest for both $X_1$ and $X_2$, although some evidence exists that, for both variables, the parameters $\theta_{33j}$, $\theta_{43j}$, and $\theta_{44j}$ of (8) are nonzero. Estimated parameters in Tables 1 and 2 are computed as in section 2.4. The estimated asymptotic standard deviations are found as in section 2. In Table 3, (22) and (23) are employed to obtain estimates.

Table 1
Parameters for Variable $X_1$

Parameter        Estimate    EASD
$\theta_{21}$      0.590     0.074
$\theta_{22}$     -2.364     0.097
$\theta_{31}$      0.701     0.100
$\theta_{32}$     -2.415     0.103
$\theta_{33}$      0.172     0.112
$\theta_{41}$      0.792     0.124
$\theta_{42}$     -2.681     0.172
$\theta_{43}$      0.294     0.140
$\theta_{44}$     -0.322     0.150

Note. EASD = estimated asymptotic standard deviation.

Table 2
Parameters for Variable $X_2$

Parameter        Estimate    EASD
$\theta_{21}$      1.059     0.076
$\theta_{22}$     -2.105     0.094
$\theta_{31}$      1.212     0.110
$\theta_{32}$     -2.224     0.117
$\theta_{33}$      0.231     0.112
$\theta_{41}$      1.287     0.137
$\theta_{42}$     -2.372     0.172
$\theta_{43}$      0.338     0.150
$\theta_{44}$     -0.173     0.132

Note. EASD = estimated asymptotic standard deviation.

Table 3
Estimated Expected Log Penalties for Variables $X_1$ and $X_2$

Variable    Degree    Estimate    EASD
$X_1$         2        2.747      0.015
$X_1$         3        2.747      0.015
$X_1$         4        2.745      0.015
$X_2$         2        2.773      0.014
$X_2$         3        2.772      0.014
$X_2$         4        2.771      0.014

Note. EASD = estimated asymptotic standard deviation.

Use of equipercentile equating leads to somewhat similar results. For $X_1$, the estimated expected log penalty $\hat I_{P1}$ in (25) is 2.741, and the EASD from (26) is 0.015. For $X_2$, $\hat I_{P2}$ is 2.765, and the EASD is 0.014. For the case of $K = 2$, for $X_1$, the estimated expected log penalty $\hat I_{P21}$ from (27) is 2.748, and the EASD from (29) is 0.015. For $X_2$, $\hat I_{P22}$ is 2.773, and the EASD is 0.014. As one illustration of results for kernel equating, consider the case of $W_1$, a normal random variable with mean 0 and standard deviation 0.622; $W_2$, a normal random variable with mean 0 and standard deviation 1.367; and $K = 2$ for both $X_1$ and $X_2$ (von Davier et al., 2004, p. 106). Computations for the kernel method are described in the user guide for Version 2.1 of the LOGLIN/KE program (Chen, Yan, Han, & von Davier, 2006). For $X_1$, the estimated expected penalty $\hat I_{W21}$ from (32) is 2.748. For $X_2$, $\hat I_{W22}$ is 2.779. In both cases, the kernel density has a derivative that only changes sign at $K - 1 = 1$ points, so that the criterion of section 1.4 for the kernel density is satisfied. At least for the example under study, it appears that the equipercentile, kernel, and continuous exponential family approaches lead to comparable results in terms of compatibility with the data.

Equating results may now be considered. The case of the conversion $e_{12}$ from Form 1 to Form 2 will be examined for the cases under study. Results are provided in Table 4. They employ formulas developed in sections 2.2 and 2.3. In kernel and percentile-rank equating, log-linear smoothing is used with the constant $K$ equal to 2 for each variable. For continuous exponential families, $K_1 = K_2 = 2$. These results are an illustration of one of a very large number of possibilities. In this example, the three conversions are very similar for all possible values of $X_1$. For the two methods for which estimated asymptotic standard deviations are available, results are rather similar. The results for the continuous exponential family are relatively best at the extremes of the distribution.


Table 4
Comparison of Conversions From Form 1 to Form 2

         Continuous exponential        Kernel            Percentile rank
Value    Estimate    EASD         Estimate    EASD       Estimate
  0        0.091     0.110         -0.061     0.194        0.095
  1        1.215     0.209          1.234     0.235        1.179
  2        2.304     0.239          2.343     0.253        2.255
  3        3.377     0.240          3.413     0.253        3.325
  4        4.442     0.230          4.473     0.242        4.392
  5        5.504     0.214          5.529     0.225        5.458
  6        6.564     0.198          6.582     0.207        6.522
  7        7.621     0.182          7.634     0.189        7.585
  8        8.677     0.169          8.685     0.174        8.647
  9        9.732     0.159          9.734     0.162        9.706
 10       10.784     0.155         10.781     0.155       10.761
 11       11.834     0.155         11.825     0.153       11.823
 12       12.880     0.160         12.865     0.156       12.859
 13       13.919     0.168         13.900     0.163       13.898
 14       14.950     0.177         14.925     0.172       14.928
 15       15.966     0.184         15.936     0.179       15.947
 16       16.959     0.187         16.925     0.182       16.949
 17       17.912     0.179         17.879     0.178       17.927
 18       18.802     0.156         18.799     0.164       18.871
 19       19.592     0.109         19.723     0.145       19.760
 20       20.240     0.040         20.818     0.119       20.380

Note. EASD = estimated asymptotic standard deviation.


4 Conclusions 

Results in this report suggest that equating via continuous exponential families can be 
regarded as a viable competitor to kernel equating. Continuous exponential families lead to 
simpler procedures and more thorough moment agreement, for fewer steps are involved in equating 
by continuous exponential families due to elimination of kernel smoothing. In addition, equating 
by continuous exponential families does not require selection of bandwidths.

One example does not produce an operational method, and kernel equating is rapidly 
approaching operational use, so it is important to consider some required steps. 

Although equivalent-group designs are used in operations both at ETS and elsewhere, a large fraction of equating designs are more complex and require at least treatment of bivariate distributions. For this purpose, continuous exponential families can be employed, for continuous exponential families can be applied to multivariate distributions. This issue is expected to be the subject of future work. No reason exists to expect that continuous exponential families cannot be applied to any standard equating situation to which kernel equating has been applied.

It is certainly appropriate to consider a variety of applications to data, and some work on
quality of large-sample approximations is appropriate when smaller sample sizes are contemplated.